HUMAN MAFA

Field of Invention 

This invention relates to polypeptides, nucleotide sequences, antibodies or fragments thereof, 
ligands and compositions and their use in the medical fields of inflammation and allergy, 
disease examples of which include rheumatoid arthritis and asthma. In addition the invention 
relates to a method for production of the polypeptides. Methods of disease treatment are 
suggested relying on agents developed in combination with the cloning of the human MAFA 
molecule. A use of the invention addresses the prevention of cell activation events in vivo 
which could lead to therapies for the prevention of tumour growth. 

Nucleotides and amino acid residues are represented herein by their standard codes as 
identified by the IUPAC-IUB Biochemical Nomenclature Commission and they include all D 
or L amino acids or analogues and derivatives thereof. The symbol X represents an 
unidentified amino acid or analogue thereof. 

Background to Invention 

Mast cells comprise a heterogeneous family of cell types derived from the bone marrow, which 
are mainly found resident in the connective tissue of the skin, lung and gut. Their common 
feature is prominent cytoplasmic granules containing heparin, histamine and proteases, which 
can be released in a process known as degranulation, into the tissues when the cells are 
appropriately activated. Mast cells are gaining recognition as participants in many 
inflammatory responses in addition to their documented role in anaphylaxis. However, the
biochemical pathways underlying the ability of extracellular stimuli to activate intracellular
events still require resolution. After immunological activation via the high-affinity Fc
receptors (FcεRI) for immunoglobulin E (IgE) on the surface of the cell, signal transduction



WO 98/54209 



PCT/GB98/01572 




pathways are initiated including the tyrosine phosphorylation of cellular proteins, 
phosphoinositide hydrolysis, an increase in intracellular calcium, and protein kinase C 
activation. The mast cell then releases a variety of mediators such as cytokines, lipid-derived
mediators, amines, proteases and proteoglycans. These early activation events are believed
to be involved in the release of the mediators. The FcεRI receptor is expressed not only on
mast cells but also on basophils, Langerhans cells, monocytes and eosinophils, although it is
now recognised that the receptor expressed on Langerhans cells and monocytes is missing the
β chain.

MAFA 

An abundant cell surface protein was identified on the surface of the rat basophil leukaemic 
cell-line RBL-2H3 as a result of monoclonal antibody screening. The antibody used, G63, was 
later shown to also bind to the surface of mucosal and connective-tissue mast cells (Ortega
et al, 1991). The cell surface protein was termed mast cell function-associated antigen
(MAFA). The cDNA sequence encoding rat MAFA (rMAFA) was isolated by expression
cloning (Guthmann et al, 1995). Rat MAFA is a type II integral membrane glycoprotein that
has extensive amino acid homology to calcium-dependent (C-type) animal lectins. Interestingly,
C-type lectins have been associated with other immunological cell types: CD72 in B
lymphocytes, FcεRII (CD23), CD69 in T and B lymphocytes, and Ly-49 and NKR-P1 on
natural killer cells.

Recently, the gene structure of rat MAFA has been published along with the sequences of two 
alternatively-spliced mRNA transcripts (Bocek et al, 1997). The full length rat MAFA mRNA 
is made up from five exons and one of the alternative transcripts lacks the transmembrane 
exon, exon 2, but maintains the correct reading frame. The other alternatively-spliced 
transcript lacks both exons 2 and 3 and does not maintain the full length rat MAFA reading 
frame. No function has yet been assigned to the alternatively spliced rat MAFA variants.

Cross-linking of rMAFA on RBL-2H3 cells using G63 monoclonal antibody has been shown
to prevent IgE-mediated degranulation as well as de novo synthesis and release of interleukin
6. This molecule is therefore likely to be a negative regulator of mast cell/basophil functional
effects exerted via the high affinity FcεRI receptor. The negative effects of the rMAFA
molecule on cell function are thought to originate from the cytoplasmic region of the molecule,
which comprises the amino-terminal 34 amino acids. Within this region is a particular sequence
[SEQ ID No. 23] (YSTL) containing a single copy of the motif YXXL/I [SEQ ID No. 24] found to
be essential in immunoreceptor tyrosine-based activation motifs (ITAMs). However, the T-cell
receptor (TCR), B-cell receptor (BCR) and FcεRI are multi-subunit receptors which possess
ten, four and three ITAMs respectively. Studies on the low affinity IgG receptor FcγRIIB
have demonstrated that cell activation triggered by the aforementioned immunoreceptors can
be inhibited if there is receptor coaggregation with FcγRIIB (Daeron et al, 1995; Muta et al,
1994). FcγRIIB has a single YXXL/I motif (similar to rat MAFA), responsible for the
immunoreceptor inhibition, which is now known as an immunoreceptor tyrosine-based
inhibition motif (ITIM). The ITIM mechanism of action in vivo is uncertain; however, it is
likely that the ITIM tyrosine residue is first phosphorylated by a src-like protein tyrosine
kinase, which allows the recruitment of an SH2-domain containing protein or lipid phosphatase
which then acts on components of the immunoreceptor signalling cascade (Ono et al, 1996).
Indeed, changes in the MAFA tyrosyl- and seryl-phosphorylation levels are observed in
response to G63 binding, antigenic stimulation, and a combination of both treatments.
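
By way of illustration only, the YXXL/I motif discussed above can be located in a one-letter amino acid sequence with a short pattern search. The following is a minimal sketch: the function name `find_yxxl` is hypothetical, and the human N-terminal string is transcribed from the first 32 residues of the translation given for SEQ ID No. 8.

```python
import re

# YXXL/I: tyrosine, any two residues, then leucine or isoleucine.
# The lookahead allows overlapping sites to be reported as well.
YXXL_I = re.compile(r"(?=(Y..[LI]))")


def find_yxxl(sequence: str) -> list[tuple[int, str]]:
    """Return (1-based position, tetrapeptide) for each YXXL/I site."""
    return [(m.start() + 1, m.group(1)) for m in YXXL_I.finditer(sequence.upper())]


# Rat MAFA motif quoted in the text (SEQ ID No. 23).
print(find_yxxl("YSTL"))

# First 32 residues of human MAFA as translated in SEQ ID No. 8.
HUMAN_N_TERMINUS = "MTDSVIYSMLELPTATQAQNDYGPQQKSSSSK"
print(find_yxxl(HUMAN_N_TERMINUS))
```

A search of this kind identifies candidate motifs by sequence alone; it says nothing about whether a given tyrosine is phosphorylated in vivo.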

Summary of Invention 

The rat MAFA molecule found on both mast cells and basophils has been cloned and shown 
to be a type II membrane glycoprotein with homology to calcium-dependent lectins. 
Alternatively spliced mRNA forms have been described, but the physiological relevance of 
these forms is unknown. 

In this invention, we have cloned the human MAFA molecule. This molecule is similar to the
rat form, having an intracellular domain containing a putative ITIM motif and the extracellular
lectin-like domain; however, the amino acid sequence suggests the presence of two additional
extracellular N-linked glycosylation sites.

Interestingly, alternative mRNA transcripts that are very different to the rat transcripts have 
been identified. Furthermore, a major transcript, not found in rat, but highly expressed in 
human lung and granulocyte-enriched blood cells, encodes a putative protein with the MAFA
intracellular and transmembrane domains followed by an 8 amino acid polyproline motif due 
to a reading frameshift. This unique sequence has been used in the design of agents that can 
be used in the treatment of inflammation or allergy. Specifically, peptides of the generic amino 
acid sequence X-Pro-X-Pro-X-X-Pro [SEQ ID No. 1] were shown to inhibit both T cell antigen 
receptor-dependent activation induced interleukin 2 secretion from human Jurkat T cells and 
IgE-dependent degranulation of RBL cells. Interleukin 2 is an autocrine growth factor for T 
cells. Therefore inhibiting its production prevents T cell proliferation and hence suppresses the 
immune system. 

The sequence of the human form of the MAFA molecule obtained from both the myelogenous 
leukaemic cell line KU812 and cDNA derived from human lung tissue is detailed in Fig. 1. 
Surprisingly, additional truncated forms of MAFA are provided which are expressed in the 
cells and tissues. One prominent form sequenced was found to encode a variant of the 
molecule in which the exon encoding the most N-terminal extracellular region (analogous to 
rat exon 3) was spliced out (huMAFA[E3-]). This phenomenon resulted in a coding amino 
acid frameshift, caused by the addition of an extra guanine nucleotide, resulting in truncation 
of the full length protein after the transmembrane domain. In addition, a new peptide motif 
of eight amino acids was encoded N-terminal to the new stop codon but continuous with the 
transmembrane sequence (fig. 2). 

A third alternatively spliced huMAFA variant was identified (huMAFA[E3/4-]) which lacked
the entire C-lectin-like domain but retained the intracellular and transmembrane domains as
well as the extracellular C-terminal tail (fig. 3).

Interestingly, both forms are membrane-bound forms of MAFA. No soluble forms of MAFA
were found corresponding to the rat MAFA [Exon 2-].

Previously, an inhibitory function has been proposed for rat MAFA (Guthmann et al, 1995).
The co-aggregation of MAFA with cross-linked FcεRI receptors, together with the suggestion
that the ITIM motif may allow intracellular binding of a protein or inositol phosphatase, led
to the hypothesis that MAFA may function as an "off switch" in regulating mast cell
activation. This would be accomplished by dephosphorylating the molecules of the FcεRI complex
or membrane lipids which become phosphorylated within seconds of receptor cross-linking.
Although the extracellular receptor for MAFA is unknown, truncated versions of membrane- 
bound human MAFA could modulate the negative regulatory mechanisms. This is indicated 
by the results shown herein. 

Therapies directed against the truncated forms of the molecule or its production would be 
expected to downregulate mast cell activation, and might therefore be useful in the treatment 
of allergic diseases. Similarly, overproduction of truncated MAFA may be associated with the 
development of atopy, and diagnosis of this could be accomplished using antibodies directed 
against unique C-terminal sequences expressed on the truncated form. Manipulation of the
production and/or function of truncated MAFA is encompassed within the scope of the
invention.

In a first embodiment the invention provides a polypeptide which comprises or consists of the 
sequence of amino acid residues: 

X-Pro-X-Pro-X-X-Pro. [SEQ ID No. 1]

Preferably it comprises or consists of the sequence of amino acid residues selected from the
group:

Pro-Pro-Leu-Pro-Gln-X-Pro [SEQ ID No. 2]
Val-Pro-Val-Pro-Lys-X-Pro [SEQ ID No. 3]
Gly-Pro-Leu-Pro-Lys-X-Pro [SEQ ID No. 4]
Ala-Pro-Leu-Pro-His-X-Pro [SEQ ID No. 5]
Thr-Pro-Leu-Pro-Lys-X-Pro [SEQ ID No. 6]
Glu-Pro-Ala-Pro-Ser-Phe-Pro-Gln. [SEQ ID No. 7]



It may comprise or consist of the sequence of amino acid residues corresponding to human
MAFA or a truncated form thereof. Preferably the truncated form is huMAFA[E3-] or
huMAFA[E3/4-].

As used herein the term "a polypeptide which comprises or consists of the sequence of amino
acid residues" means either (i) a polypeptide which includes in its sequence the identified
sequence "motif" as part of the polypeptide, or (ii) a polypeptide which is terminated and has
the sequence of the identified sequence motif.

For example, a polypeptide of type (i) is cloned human MAFA or a truncated version thereof 
which includes the motif sequence 

-X-Pro-X-Pro-X-X-Pro- [SEQ ID No. 1]

For example, a polypeptide of type (ii) is a peptide of formula

Ac-X-Pro-X-Pro-X-X-Pro-NH2

Although polypeptides according to the invention may contain an amino acid motif included
in a relatively long sequence (such as full length human MAFA), this invention also provides
relatively short length amino acid sequences of general formula (aa)n wherein n is any integer
between 7 and 20, preferably between 7 and 10, most preferably 7 or 8.

By way of example, the 7mer polypeptide of amino acid sequence

Ac-Pro-Pro-Leu-Pro-Gln-X-Pro-NH2 [SEQ ID No. 2]

consists of the motif sequence

X-Pro-X-Pro-X-X-Pro

whereas the 8mer polypeptide of sequence

Ac-Glu-Pro-Ala-Pro-Ser-Phe-Pro-Gln-NH2

includes the same motif.
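
The distinction drawn above between a polypeptide that consists of the motif and one that comprises it can be sketched as a pattern test on one-letter sequences. This is an illustrative sketch only: the function names are hypothetical, terminal Ac- and -NH2 groups are ignored, and the letter X is treated as any residue.

```python
import re

# X-Pro-X-Pro-X-X-Pro in one-letter code; "." stands for X (any residue).
MOTIF = re.compile(r".P.P..P")


def consists_of_motif(peptide: str) -> bool:
    """True when the whole peptide is exactly the seven-residue motif."""
    return MOTIF.fullmatch(peptide.upper()) is not None


def comprises_motif(peptide: str) -> bool:
    """True when the motif occurs anywhere within the peptide."""
    return MOTIF.search(peptide.upper()) is not None


seven_mer = "PPLPQXP"   # SEQ ID No. 2, with X left unspecified
eight_mer = "EPAPSFPQ"  # SEQ ID No. 7

print(consists_of_motif(seven_mer), comprises_motif(seven_mer))
print(consists_of_motif(eight_mer), comprises_motif(eight_mer))
```

The 7mer satisfies both tests, whereas the 8mer only comprises the motif because of its additional C-terminal residue.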

In a second embodiment the invention provides a nucleotide sequence which codes for the 
polypeptide sequence. 

In a third embodiment the invention provides an antibody or fragment thereof specific for an
epitope of the C terminal extracellular domain sequences expressed on spliced type II C-lectin- 
like membrane proteins or an epitope of the N terminal intracellular domain sequences of type 
II C-lectin-like membrane proteins. Preferably the type II C-lectin-like membrane protein is 
human MAFA or a truncated form thereof. Preferably the truncated form is human 
MAFA[E3-] or human MAFA[E3/4-]. 

In a fourth embodiment the invention provides a ligand specific for a fragment of human 
MAFA which is expressed on the surface of filamentous phage. 

In a fifth embodiment the invention provides a composition comprising a therapeutic amount 
of the polypeptide, antibody or fragment thereof or ligand, together with a pharmaceutically 
acceptable diluent or carrier. 

In a sixth embodiment the invention provides the polypeptide, nucleotide sequence, antibody, 
or fragment thereof, ligand or composition for use as a medicament. Preferably they are used 
in the treatment of inflammatory or allergic diseases or tumour growth. 

In a seventh embodiment the invention provides use of the polypeptide, nucleotide sequence, 
antibody or fragment thereof, ligand or composition in the manufacture of a medicament for 
the treatment of inflammatory or allergic diseases or tumour growth. 

In an eighth embodiment the invention provides a method of treatment for inflammatory or 
allergic diseases which comprises administering an effective dose of the polypeptide, antibody 
or fragment thereof, ligand or composition. 

In a ninth embodiment the invention provides a method of preparing the polypeptide which 
comprises the steps of: 




i) Nα-Fmoc deprotection;

ii) washing;

iii) coupling of a single amino acid residue or amino acid mixtures;

iv) washing;

v) repeating until the desired polypeptide is constructed.

Detailed Description of the Invention

The invention will now be described by reference to the accompanying drawings in which: 

Figure 1 shows the nucleotide sequence [SEQ ID No. 8,9] encoding the full-length expressed
form of human MAFA (nucleotides 1-570). The expected amino acid translation is shown
beneath the nucleotide sequence. Putative N-linked glycosylation sites are underlined. (The
two amino acids in italics refer to polymorphic mutations.)

Figure 2 shows the nucleotide sequence and putative amino acid sequence [SEQ ID No. 10,11]
of the 400 bp alternative human transcript (huMAFA[E3-]). The amino acid translation resulting
from the reading frameshift is shown in bold. (* represents a stop codon, so no further
translation occurs. Amino acids in italics refer to polymorphic mutations.)

Figure 3 shows the nucleotide sequence and putative amino acid sequence [SEQ ID No.
12,13] of the 301 bp alternative human MAFA transcript (huMAFA[E3/4-]). The nucleotide
sequence encoding the huMAFA C-terminal region is underlined (analogous to rat exon
5). (Amino acids in italics refer to polymorphic mutations.)

Figure 4 shows the nucleotide and amino acid sequence [SEQ ID No. 14,15] of rat MAFA. 
Putative N-linked glycosylation sites are underlined. 

BLAST program searches using the Internet NCBI software for human C-lectin-like sequences
identified two expressed sequence tags (ESTs), AA186699 and AA188327, which showed
some homology to the rat MAFA cDNA sequence. After careful analysis of the EST
sequences, we designed PCR primers which we predicted represented the 5' and 3' ends of
the human MAFA coding cDNA. PCR using these primers on cDNA made from basophilic
leukaemic cells (KU812s), mast cell-enriched lung cells and basophil-enriched blood
cells resulted in three different sized PCR DNA products of approximately 580, 400 and
300 bp. These DNA products were cloned into the sequencing vector pCR-script
(Stratagene) and sequenced in both the forward and reverse directions using the T7 and T3
sequencing primers.

The largest PCR product was shown to represent the full coding sequence for human
MAFA (fig. 1), the 400 bp product huMAFA[E3-] (fig. 2) and the 300 bp product
huMAFA[E3/4-] (fig. 3).

The full length human MAFA is one amino acid longer than its rat homologue and 
possesses two additional N-linked glycosylation sites (fig. 1). Two presumed polymorphic
mutations were detected between samples: nucleotide 95 A-G, resulting in a codon change
of Lys to Arg, and nucleotide 124 A-G, resulting in a codon change of threonine to alanine.
These changes are quite conservative and probably do not affect structure or function. 
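
The two codon changes described above can be checked against the first 144 nucleotides of SEQ ID No. 8 using the standard genetic code. The following is a sketch only: the helper name `codon_change` is hypothetical, and the sequence string is transcribed from the sequence listing.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# First 144 nucleotides of SEQ ID No. 8 (three rows of the sequence listing).
CDS_START = (
    "ATGACTGACAGTGTTATTTATTCCATGTTAGAGTTGCCTACGGCAACC"
    "CAAGCCCAGAATGACTACGGACCACAGCAAAAATCTTCCTCTTCCAAG"
    "CCTTCTTGTTCTTGCCTTGTGGCAATAACTTTGGGGCTTCTGACTGCA"
)


def codon_change(cds: str, position: int, new_base: str) -> tuple[int, str, str]:
    """Return (codon number, original residue, mutated residue).

    position is the 1-based nucleotide number used in the text.
    """
    index = position - 1
    start = index - index % 3          # first base of the affected codon
    old_codon = cds[start:start + 3]
    offset = index - start
    new_codon = old_codon[:offset] + new_base + old_codon[offset + 1:]
    return start // 3 + 1, CODON_TABLE[old_codon], CODON_TABLE[new_codon]


print(codon_change(CDS_START, 95, "G"))   # Lys to Arg
print(codon_change(CDS_START, 124, "G"))  # Thr to Ala
```

On this transcription, nucleotide 95 falls in codon 32 (AAG to AGG) and nucleotide 124 in codon 42 (ACT to GCT), in agreement with the changes stated above.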

Sequences based on the alternatively spliced human MAFA[E3-] variant 

The human MAFA[E3-] variant has the same putative intracellular and transmembrane
amino acid sequence as full length MAFA, but following this sequence is the unique
sequence:

Glu-Pro-Ala-Pro-Ser-Phe-Pro-Gln. [SEQ ID No. 7]

This sequence has been synthesised.

A library of peptides was constructed to represent the generic structure: 

Ac-X-Pro-X-Pro-X-X-Pro-NH2 [SEQ ID No. 1]

These library peptides were tested in whole cell systems for their ability to modulate the 
effects of cell stimulation. Separate peptide mixtures were found that could inhibit T-cell 
antigen receptor dependent interleukin-2 release from human T cells or prevent IgE- 
mediated degranulation of rat basophils: 

Interleukin-2 inhibitors are included in the following mixtures:

Ac-Pro-Pro-Leu-Pro-Gln-A/E/F/G/I/L/K/H/N/P/Q/S/T/V/Y-Pro-NH2 [SEQ ID No. 16]

Ac-Gly-Pro-Leu-Pro-Lys-A/E/F/G/I/L/K/H/N/P/Q/S/T/V/Y-Pro-NH2 [SEQ ID No. 17]

Ac-Val-Pro-Val-Pro-Lys-A/E/F/G/I/L/K/H/N/P/Q/S/T/V/Y-Pro-NH2 [SEQ ID No. 18]

Degranulation inhibitors are included in the following mixtures:

Ac-Ala-Pro-Leu-Pro-His-A/E/F/G/I/L/K/H/N/P/Q/S/T/V/Y-Pro-NH2 [SEQ ID No. 19]
Ac-Thr-Pro-Leu-Pro-Lys-A/E/F/G/I/L/K/H/N/P/Q/S/T/V/Y-Pro-NH2 [SEQ ID No. 20]

These sequences indicate that the generic sequences for interleukin-2 inhibitors are:

Pro-Pro-Leu-Pro-Gln-X-Pro
Val-Pro-Val-Pro-Lys-X-Pro
Gly-Pro-Leu-Pro-Lys-X-Pro

and the generic sequences for degranulation inhibitors are:

Ala-Pro-Leu-Pro-His-X-Pro 
Thr-Pro-Leu-Pro-Lys-X-Pro 
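
Each mixture listed above contains one peptide per residue at the variable sixth position. As an illustrative sketch, the individual members of a mixture can be enumerated as follows; the function name `expand_mixture` is hypothetical, and the fifteen one-letter codes are those given in the mixture formulae.

```python
# One-letter codes at the variable position, as written in the mixtures.
VARIABLE_RESIDUES = "A/E/F/G/I/L/K/H/N/P/Q/S/T/V/Y".split("/")

THREE_LETTER = {
    "A": "Ala", "E": "Glu", "F": "Phe", "G": "Gly", "I": "Ile",
    "L": "Leu", "K": "Lys", "H": "His", "N": "Asn", "P": "Pro",
    "Q": "Gln", "S": "Ser", "T": "Thr", "V": "Val", "Y": "Tyr",
}


def expand_mixture(prefix: str) -> list[str]:
    """List every peptide in a mixture of form Ac-<prefix>-X-Pro-NH2."""
    return [
        f"Ac-{prefix}-{THREE_LETTER[code]}-Pro-NH2"
        for code in VARIABLE_RESIDUES
    ]


mixture = expand_mixture("Pro-Pro-Leu-Pro-Gln")
print(len(mixture))
for peptide in mixture:
    print(peptide)
```

The same call with any of the other four prefixes gives the members of the corresponding mixture.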

Experimental Methods 
Cell Culture 

KU812 cells (Kishi, 1985) and Jurkat E17 T cells (Williams et al, 1995) were grown in
RPMI 1640 (GIBCO) supplemented with 10 % (vol/vol) heat inactivated foetal calf serum,
50 IU/ml penicillin, 50 µg/ml streptomycin and 2 mM L-glutamine. RBL cells were grown
in DMEM (GIBCO) supplemented with 10 % (vol/vol) heat inactivated foetal calf serum,
50 IU/ml penicillin and 50 µg/ml streptomycin. All cells were grown at 37 °C in humidified
5 % CO2/95 % air.

Cell Isolation 

Peripheral blood cells obtained in "buffy coats" were fractionated using Ficoll®, and the washed
white cell pellet was further fractionated using Percoll as described by Raghuprasad (1982) to
provide basophil-enriched cell populations. Red blood cells were lysed by suspending the cell
pellet twice in 8.29 g/l NH4Cl, 0.84 g/l NaHCO3, 37.3 µg/l EDTA, pH 7.3. The remaining
cells were treated as basophil-enriched cells and were shown to contain 10-20% basophils after
Wright's solution staining of cytospin-prepared slide samples.

Human lung biopsy samples (100-170 g) were minced finely using scissors and placed in
enzymatic digestion cocktail (35 mg/ml BSA, fraction V, 0.38 mg/ml hyaluronidase, 0.25
mg/ml pronase, 0.03 mg/ml DNase I, 0.75 mg/ml bacterial collagenase in DMEM) at 5 ml
cocktail/g tissue for 1 hour at 37 °C with agitation. The digest was then filtered through a
0.75 µm nylon mesh to remove undigested material and washed twice in PBS. Samples of
these cells were stained using Wright's solution and found to contain 5-10% mast cells.
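
For convenience, the quantities of each cocktail component needed for a given tissue mass follow directly from the stated concentrations and the ratio of 5 ml cocktail per gram of tissue. The sketch below is illustrative only; the function name `cocktail_for` is hypothetical.

```python
# Component concentrations in the digestion cocktail, in mg/ml.
COCKTAIL_MG_PER_ML = {
    "BSA, fraction V": 35.0,
    "hyaluronidase": 0.38,
    "pronase": 0.25,
    "DNase I": 0.03,
    "bacterial collagenase": 0.75,
}
ML_COCKTAIL_PER_G_TISSUE = 5.0


def cocktail_for(tissue_g: float) -> tuple[float, dict[str, float]]:
    """Return (cocktail volume in ml, mass of each component in mg)."""
    volume_ml = tissue_g * ML_COCKTAIL_PER_G_TISSUE
    masses_mg = {name: conc * volume_ml for name, conc in COCKTAIL_MG_PER_ML.items()}
    return volume_ml, masses_mg


for tissue_g in (100.0, 170.0):  # the range of biopsy masses given above
    volume_ml, masses_mg = cocktail_for(tissue_g)
    print(f"{tissue_g:.0f} g tissue: {volume_ml:.0f} ml cocktail")
    for name, mg in masses_mg.items():
        print(f"  {name}: {mg:.1f} mg")
```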

RBL degranulation assays 

RBL cells were harvested by scraping and resuspended to 1 x 10^6 cells/ml in DMEM
supplemented with 10 % (vol/vol) heat inactivated foetal calf serum, 50 IU/ml penicillin,
50 µg/ml streptomycin. 50 µl of cells were added to wells in a 96 well flat-bottomed plate and
incubated overnight at 37 °C with 50 µl of 200 ng/ml anti DNP-IgE. The medium was then
replaced with degranulation buffer (phenol red free RPMI 1640 (Gibco), 1 g glucose, 0.5 g
BSA in 500 ml) containing 4 µM test peptide, and incubation was performed at 37 °C for 1 hour.
DNP-BSA was then added to 100 ng/ml and incubation performed for a further 45 minutes.
Buffer was then removed from the cells and assayed for hexosaminidase activity.

Interleukin 2 secretion assay 

Jurkat E17 T cells were harvested from actively growing suspension culture and
resuspended to 4 x 10^6 cells/ml in fresh medium. Cells were plated out in 96 well plates
with test peptide at 2 µM at a concentration of 2 x 10^6 cells/ml. Preincubation was then
performed for 1 hour at 37 °C, followed by the addition of 2 µg/ml PHA and 50 ng/ml
PDBu and overnight incubation at 37 °C. Medium was then removed and assayed to
determine the amount of interleukin 2 by ELISA (Genzyme kit).

Reverse Transcribed-Polymerase Chain Reactions (RT-PCR) 

Messenger RNA (mRNA) was isolated from cell pellets using a Pharmacia mRNA isolation kit,
as described in the manufacturer's instructions; this was used to make cDNA utilising oligo dT
primers. Single-stranded DNA 5' and 3' primers were designed to amplify the full human
MAFA coding sequence flanked by BamHI restriction enzyme sites.

5' primer:

GCCGGATCCGATGACTGACAGTGTTATTTATTCCATGTTA [SEQ ID No. 21]

3' primer:

TAAGGATCCTCAAAGTCTGACCTTCTTACACACCCAGTG [SEQ ID No. 22]
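
As a simple consistency check on the primer design, each primer can be inspected for the BamHI recognition site (GGATCC), the 5' primer compared with the start of the coding sequence of SEQ ID No. 8, and the 3' primer reverse-complemented to read the end of the coding strand. This is a sketch only; the variable and function names are hypothetical.

```python
BAMHI_SITE = "GGATCC"
FIVE_PRIME = "GCCGGATCCGATGACTGACAGTGTTATTTATTCCATGTTA"   # SEQ ID No. 21
THREE_PRIME = "TAAGGATCCTCAAAGTCTGACCTTCTTACACACCCAGTG"   # SEQ ID No. 22
CDS_FIRST_30 = "ATGACTGACAGTGTTATTTATTCCATGTTA"           # start of SEQ ID No. 8


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


# Both primers carry the BamHI site near their 5' ends.
print(BAMHI_SITE in FIVE_PRIME, BAMHI_SITE in THREE_PRIME)

# The 5' primer ends with the first 30 nucleotides of the coding sequence.
print(FIVE_PRIME.endswith(CDS_FIRST_30))

# Coding-strand view of the 3' primer: the part before the BamHI site
# is the end of the coding sequence and finishes with a stop codon.
coding_end = reverse_complement(THREE_PRIME).split(BAMHI_SITE)[0]
print(coding_end, coding_end[-3:])
```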

PCR using these primers and 20 ng template cDNA was performed at 94°C for 2 minutes, then
35 cycles of 94°C for 15 seconds, 65°C for 30 seconds and 72°C for 45 seconds, followed by
72°C for 5 minutes, using Klentaq (Clontech, USA). PCR amplicons were then cloned into
pCR-script (Stratagene) as described in the manufacturer's instructions prior to DNA
sequencing on an Applied Biosystems DNA sequencer.
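
The cycling conditions above can be written down as a small programme description, from which the total hold time follows. The sketch below is illustrative; the class and function names are hypothetical and ramp times between temperatures are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: int
    seconds: int


INITIAL_DENATURATION = Step(94, 120)
CYCLE = (Step(94, 15), Step(65, 30), Step(72, 45))  # denature, anneal, extend
N_CYCLES = 35
FINAL_EXTENSION = Step(72, 300)


def total_hold_seconds() -> int:
    """Sum of all programmed hold times, excluding temperature ramps."""
    per_cycle = sum(step.seconds for step in CYCLE)
    return INITIAL_DENATURATION.seconds + N_CYCLES * per_cycle + FINAL_EXTENSION.seconds


seconds = total_hold_seconds()
print(f"{seconds} s ({seconds / 60:.1f} min)")
```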

DNA sequence analysis 

DNA sequencing was performed using the Perkin-Elmer Taq polymerase system in 
conjunction with an Applied Biosystems 373 sequencer. Sequence analysis was performed 
using DNAstar and NCBI blast programs. 

Peptide Synthesis 

Peptides were prepared by the multipin synthesis technique which is set out below:

Preparation of Multipin Assembly

Whilst wearing standard plastic gloves, the Fmoc-Rink-DA/MDA macrocrowns are
assembled (simply clipped) onto stems and slotted into an 8 x 12 stem holder in the desired
pattern for synthesis.

Peptides are then prepared as singles or defined equimolar mixtures by repetitive rounds of
Nα-Fmoc deprotection, washing, coupling of a single amino acid or amino acid mixtures,
washing etc. until the desired primary sequences have been constructed.

Removal of Nα-Fmoc Protection

A 250 ml solvent resistant bath was charged with 200 ml of a 20% piperidine/DMF
solution. The multipin assembly was added and deprotection allowed to proceed for 30
minutes. The assembly was then removed and excess solvent removed by brief shaking.
The assembly was then washed consecutively (200 ml each) with DMF (5 minutes) and
MeOH (5 minutes, 2 minutes, 2 minutes) and left to air dry for 15 minutes.

Quantitative UV Measurement of Fmoc Chromophore Release

A 1 cm path length UV cell was charged with 1.2 ml of a 20% piperidine/DMF solution
and used to zero the absorbance of the UV spectrometer at a wavelength of 290 nm. A UV
standard was then prepared consisting of 5.0 mg Fmoc-Asp(OBut)-Pepsyn KA (0.08
mmol/g) in 3.2 ml of a 20% piperidine/DMF solution. This standard gives Abs290 = 0.55-0.65
(at room temperature). An aliquot of the multipin deprotection solution was then
diluted as appropriate to give a theoretical Abs290 = 0.6, and this value was compared with the
actual experimentally measured absorbance, showing the efficiency of the previous coupling
reaction.
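
The dilution "as appropriate" can be estimated from the standard described above. The following sketch rests on several assumptions that are not stated in the text: that absorbance is linear in Fmoc concentration, that every crown releases its full nominal loading (taken as approximately 7 µmol per macrocrown) into the 200 ml deprotection bath, and that the standard corresponds to the mid-range value Abs290 = 0.6. All function names are hypothetical.

```python
STANDARD_RESIN_MG = 5.0
STANDARD_LOADING_MMOL_PER_G = 0.08
STANDARD_VOLUME_ML = 3.2
TARGET_ABS = 0.6  # mid-point of the 0.55-0.65 range given for the standard


def standard_conc_umol_per_ml() -> float:
    """Fmoc concentration of the UV standard (mg x mmol/g gives µmol)."""
    umol = STANDARD_RESIN_MG * STANDARD_LOADING_MMOL_PER_G
    return umol / STANDARD_VOLUME_ML


def dilution_factor(n_crowns: int, umol_per_crown: float = 7.0,
                    bath_ml: float = 200.0) -> float:
    """Fold dilution bringing the bath to the standard's concentration."""
    bath_conc = n_crowns * umol_per_crown / bath_ml
    return bath_conc / standard_conc_umol_per_ml()


def apparent_efficiency(measured_abs: float, theoretical_abs: float = TARGET_ABS) -> float:
    """Ratio of measured to theoretical absorbance after dilution."""
    return measured_abs / theoretical_abs


print(f"standard: {standard_conc_umol_per_ml():.3f} µmol/ml")
print(f"96 crowns: dilute about {dilution_factor(96):.1f}-fold")
print(f"measured 0.57: efficiency about {apparent_efficiency(0.57):.0%}")
```

The measured value of 0.57 in the last line is a made-up figure used purely to show the calculation.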

Coupling of Standard Amino Acid Residues 

Coupling reactions were performed by charging the appropriate wells of a polypropylene 96
well plate with the pattern of activated solutions required during a particular round of
coupling. Macrocrown (approx 7 µmole) standard couplings were performed in DMF (500 µl).

Coupling of an Amino-acid Residue To Appropriate Well

Whilst the multipin assembly was drying, the appropriate Nα-Fmoc amino acid pfp esters
(10 equivalents calculated from the loading of each crown) and HOBt (10 equivalents)
required for the particular round of coupling were accurately weighed into suitable
containers. Alternatively, the appropriate Nα-Fmoc amino acids (10 equivalents calculated
from the loading of each crown), desired coupling agent e.g. HBTU (9.9 equivalents
calculated from the loading of each crown) and activation reagents e.g. HOBt (9.9 equivalents
calculated from the loading of each crown), NMM (19.9 equivalents calculated from the
loading of each crown) were accurately weighed into suitable containers.

The protected and activated Fmoc amino acid derivatives were then dissolved in DMF (500
µl for each macrocrown e.g. for 20 macrocrowns, 20 x 10 eq. x 7 µmoles of derivative
would be dissolved in 10 000 µl DMF). The appropriate derivatives were then dispensed to
the appropriate wells ready for commencement of the 'coupling cycle'. As a standard,
coupling reactions were allowed to proceed for 6 hours. The coupled assembly was then
washed as detailed below.
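
The worked example given above (20 macrocrowns) generalises to any number of crowns. The sketch below uses the equivalents quoted for the HBTU/HOBt/NMM alternative and the nominal loading of about 7 µmol per macrocrown; the function name `coupling_batch` is hypothetical.

```python
CROWN_LOADING_UMOL = 7.0    # approximate loading of one macrocrown
DMF_UL_PER_CROWN = 500.0

# Equivalents relative to crown loading, as quoted in the text.
EQUIVALENTS = {
    "Fmoc amino acid": 10.0,
    "HBTU": 9.9,
    "HOBt": 9.9,
    "NMM": 19.9,
}


def coupling_batch(n_crowns: int) -> dict:
    """Amounts needed to couple one derivative onto n_crowns macrocrowns."""
    return {
        "dmf_ul": n_crowns * DMF_UL_PER_CROWN,
        "reagents_umol": {
            name: eq * CROWN_LOADING_UMOL * n_crowns
            for name, eq in EQUIVALENTS.items()
        },
    }


batch = coupling_batch(20)
print(f"DMF: {batch['dmf_ul']:.0f} µl")
for name, umol in batch["reagents_umol"].items():
    print(f"{name}: {umol:.0f} µmol")
```

For 20 macrocrowns this reproduces the figures in the text: 20 x 10 eq. x 7 µmoles of derivative in 10 000 µl DMF.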

Equimolar Coupling Of An Amino Acid Residue Mixture 

Equimolar coupling reactions were performed by charging the appropriate wells of a 
polypropylene 96 well plate with the pattern of activated solutions required during a 
particular round of coupling. The equimolar coupling cycle is a 3 stage cycle consisting of:

1) A 0.98eq coupling overnight, i.e. for the equimolar addition of 15 residues, 0.98 / 15 =
0.0653eq of each residue is weighed and activated as a single mixture.

2) Repeat of 1).

3) A 9.8eq coupling for 3hrs, i.e. for the equimolar addition of 15 residues, 9.8 / 15 =
0.653eq of each residue is weighed and activated as a single mixture.
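
The per-residue quantities in the three-stage cycle are simply the total equivalents for the stage divided by the number of residues in the mixture. A minimal sketch of that arithmetic follows; the names are hypothetical.

```python
# (stage description, total equivalents of the residue mixture)
STAGES = (
    ("first coupling, overnight", 0.98),
    ("repeat coupling, overnight", 0.98),
    ("final coupling, 3 hours", 9.8),
)


def per_residue_equivalents(total_equivalents: float, n_residues: int) -> float:
    """Equivalents of each residue in an equimolar mixture."""
    if n_residues < 1:
        raise ValueError("a mixture needs at least one residue")
    return total_equivalents / n_residues


for label, total in STAGES:
    each = per_residue_equivalents(total, 15)
    print(f"{label}: {total} eq total, {each:.4f} eq of each of 15 residues")
```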

Equimolar Coupling of an Amino-acid Residue Mixture To Appropriate Well

Whilst the multipin assembly was drying, the appropriate Nα-Fmoc amino acid pfp esters
and HOBt required for the particular round of equimolar coupling were accurately
weighed into suitable containers (see above for mixture composition). Alternatively, the
appropriate Nα-Fmoc amino acids, desired coupling agent e.g. HBTU and activation reagents e.g.
HOBt, NMM were accurately weighed into suitable containers (see above for mixture
composition).

The protected and activated Fmoc amino acid derivatives were then dissolved in DMF (500
µl for each macrocrown e.g. for 20 macrocrowns, 20 x 10 eq. x 7 µmoles of derivative
was dissolved in 10 000 µl DMF). The appropriate derivatives were then dispensed to the
appropriate wells ready for commencement of the 'coupling cycle'. The standard equimolar
coupling procedure is outlined above. The coupled assembly was then washed as detailed
below.

Washing Following Coupling 

If a 20% piperidine/DMF deprotection was to immediately follow the coupling cycle, then
the multipin assembly was briefly shaken to remove excess solvent, washed consecutively
(200 ml each) with MeOH (5 minutes) and DMF (5 minutes), and deprotected (see 6.2). If
the multipin assembly was to be stored or reacted further, then a full washing cycle
consisting of brief shaking then consecutive washes (200 ml each) with DMF (5 minutes) and
MeOH (5 minutes, 2 minutes, 2 minutes) was performed.

Acidolytic Mediated Cleavage of Peptide-Pin Assembly

Acid mediated cleavage protocols were strictly performed in a fume hood. A polystyrene 96
well plate (1 ml/well) was labelled, then the tare weight measured to the nearest mg.
Appropriate wells were then charged with a trifluoroacetic acid/triethylsilane (95:5, v/v,
600 µl) cleavage solution, in a pattern corresponding to that of the multipin assembly to be
cleaved.

The multipin assembly was added, the entire construct covered in tin foil and left for 2
hours. The multipin assembly was then added to another polystyrene 96 well plate (1
ml/well) containing trifluoroacetic acid/triethylsilane (95:5, v/v, 600 µl) (as above) for 5
minutes.

Work up of Cleaved Peptides 

The primary polystyrene cleavage plate (2 hour cleavage) and the secondary polystyrene 
plate (5 minute wash) were then placed in the SpeedVac and the solvents removed 
(minimum drying rate) for 90 minutes. 

The contents of the secondary polystyrene plate were transferred to their corresponding 
wells on the primary plate using an acetonitrile/water/acetic acid (50:45:5, v/v/v) solution 
(3 x 150 µl) and the spent secondary plate discarded.

Analysis of Products 

A 5 µl aliquot from each well was diluted to 100 µl with 0.1% aq. TFA, then a 10 µl
aliquot from this plate was diluted with a further 100 µl of 0.1% aq. TFA. The double-diluted
plate was analysed by HPLC-MS.
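
The overall dilution applied before HPLC-MS can be worked out from the two steps above. The sketch assumes that "diluted with a further 100 µl" means that 100 µl of diluent is added to the 10 µl aliquot, giving 110 µl; the function name is hypothetical.

```python
def fold_dilution(aliquot_ul: float, final_volume_ul: float) -> float:
    """Fold dilution when an aliquot is made up to a final volume."""
    return final_volume_ul / aliquot_ul


first = fold_dilution(5.0, 100.0)          # 5 µl made up to 100 µl
second = fold_dilution(10.0, 10.0 + 100.0)  # 10 µl plus a further 100 µl
overall = first * second

print(f"first step: {first:.0f}-fold")
print(f"second step: {second:.0f}-fold")
print(f"overall: {overall:.0f}-fold")
```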




Final Lyophilisation of Peptides 

The plate was covered with tin foil, held to the plate with an elastic band. A pin prick was 
placed in the foil directly above each well and the plate placed at -80°C for 30 minutes. The 
plate was then lyophilised on the 'Heto freeze drier' overnight. 

Finally, the dried plate was weighed. The total cleaved peptide was quantified (by weight) 
and the average content of each peptide calculated. Since all the peptides present originated 
from the same peptide-pin assembly, cleaved under identical conditions, it is reasonable to 
assume that the contents of each well were approximately equimolar. 
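
The gravimetric estimate described above amounts to subtracting the tare weight from the dried plate weight and dividing by the number of wells and, for mixture wells, by the number of peptides per well. The sketch below uses made-up plate weights purely for illustration; the function name and all numerical inputs are hypothetical.

```python
def average_peptide_content(tare_mg: float, dried_mg: float,
                            n_wells: int, peptides_per_well: int = 1):
    """Return (total mg, mg per well, mg per individual peptide).

    Assumes every charged well holds an approximately equal amount.
    """
    if dried_mg < tare_mg:
        raise ValueError("dried plate cannot weigh less than the tare weight")
    total_mg = dried_mg - tare_mg
    per_well_mg = total_mg / n_wells
    return total_mg, per_well_mg, per_well_mg / peptides_per_well


# Hypothetical weights: 96 charged wells, 15-component mixtures in each.
total, per_well, per_peptide = average_peptide_content(
    tare_mg=25000.0, dried_mg=25480.0, n_wells=96, peptides_per_well=15
)
print(f"total {total:.0f} mg, {per_well:.2f} mg/well, {per_peptide:.3f} mg/peptide")
```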

SEQUENCE LISTING



(1) GENERAL INFORMATION: 

(i) APPLICANT: 

(A) NAME: PEPTIDE THERAPEUTICS LIMITED 

(B) STREET: 321 CAMBRIDGE SCIENCE PARK 

(C) CITY : CAMBRIDGE 

(D) STATE: CAMBRIDGE 

(E) COUNTRY: ENGLAND 

(F) POSTAL CODE (ZIP) : CB4 4WG 

(G) TELEPHONE: 01223 423333 

(H) TELEFAX: 01223 423111 

(ii) TITLE OF INVENTION: Human MAFA 
(iii) NUMBER OF SEQUENCES : 24 

(iv) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30 (EPO)



(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 7 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

Xaa Pro Xaa Pro Xaa Xaa Pro
1               5

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 7 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Pro Pro Leu Pro Gln Xaa Pro
1               5

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 7 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

Val Pro Val Pro Lys Xaa Pro
1               5

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 7 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Gly Pro Leu Pro Lys Xaa Pro
1               5

(2) INFORMATION FOR SEQ ID NO : 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

Ala Pro Leu Pro His Xaa Pro 
1 5 

(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

Thr Pro Leu Pro Lys Xaa Pro 
1 5 

(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

Glu Pro Ala Pro Ser Phe Pro Gln 
1 5 
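
The generic pattern of SEQ ID NO: 1 (Xaa Pro Xaa Pro Xaa Xaa Pro) can be written as a regular expression and searched for in any one-letter sequence. The sketch below is illustrative only: the helper names `to_one_letter` and `find_motif` are assumptions made for this example, and Xaa is mapped to the letter X.

```python
import re

# Standard three-letter to one-letter amino acid codes; Xaa is written as X.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
    "Xaa": "X",
}

# Any residue, Pro, any, Pro, any, any, Pro.
# The lookahead allows overlapping occurrences to be reported.
MOTIF = re.compile(r"(?=([A-Z]P[A-Z]P[A-Z][A-Z]P))")


def to_one_letter(three_letter: str) -> str:
    """Convert a space-separated three-letter sequence to one-letter code."""
    return "".join(THREE_TO_ONE[residue] for residue in three_letter.split())


def find_motif(sequence: str) -> list[tuple[int, str]]:
    """Return (1-based start position, matched heptapeptide) for each hit."""
    return [(m.start() + 1, m.group(1)) for m in MOTIF.finditer(sequence)]


seq7 = to_one_letter("Glu Pro Ala Pro Ser Phe Pro Gln")  # SEQ ID NO: 7
hits = find_motif(seq7)
```

Applied to SEQ ID NO: 7, the search reports a single occurrence starting at residue 1, that is, the first seven of its eight residues.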

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 570 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..567 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

ATG ACT GAC AGT GTT ATT TAT TCC ATG TTA GAG TTG CCT ACG GCA ACC 48 
Met Thr Asp Ser Val Ile Tyr Ser Met Leu Glu Leu Pro Thr Ala Thr 
1 5 10 15 

CAA GCC CAG AAT GAC TAC GGA CCA CAG CAA AAA TCT TCC TCT TCC AAG 96 
Gln Ala Gln Asn Asp Tyr Gly Pro Gln Gln Lys Ser Ser Ser Ser Lys 
20 25 30 

CCT TCT TGT TCT TGC CTT GTG GCA ATA ACT TTG GGG CTT CTG ACT GCA 144 
Pro Ser Cys Ser Cys Leu Val Ala Ile Thr Leu Gly Leu Leu Thr Ala 
35 40 45 

GTT CTT CTG AGT GTG CTG CTA TAC CAG TGG ATC CTG TGC CAG GGC TCC 192 
Val Leu Leu Ser Val Leu Leu Tyr Gln Trp Ile Leu Cys Gln Gly Ser 
50 55 60 

AAC TAC TCC ACT TGT GCC AGC TGT CCT AGC TGC CCA GAC CGC TGG ATG 240 
Asn Tyr Ser Thr Cys Ala Ser Cys Pro Ser Cys Pro Asp Arg Trp Met 
65 70 75 80 

AAA TAT GGT AAC CAT TGT TAT TAT TTC TCA GTG GAG GAA AAG GAC TGG 288 
Lys Tyr Gly Asn His Cys Tyr Tyr Phe Ser Val Glu Glu Lys Asp Trp 
85 90 95 

AAT TCT AGT CTG GAA TTC TGC CTA GCC AGA GAC TCA CAC CTC CTT GTG 336 
Asn Ser Ser Leu Glu Phe Cys Leu Ala Arg Asp Ser His Leu Leu Val 
100 105 110 

ATA ACG GAC AAT CAG GAA ATG AGC CTG CTC CAA GTT TTC CTC AGT GAG 384 
Ile Thr Asp Asn Gln Glu Met Ser Leu Leu Gln Val Phe Leu Ser Glu 
115 120 125 

GCC TTT TGC TGG ATT GGT CTG AGG AAC AAT TCT GGC TGG AGG TGG GAA 432 
Ala Phe Cys Trp Ile Gly Leu Arg Asn Asn Ser Gly Trp Arg Trp Glu 
130 135 140 

GAC GGA TCA CCT CTA AAC TTC TCA AGG ATT TCT TCT AAT AGC TTT GTG 480 
Asp Gly Ser Pro Leu Asn Phe Ser Arg Ile Ser Ser Asn Ser Phe Val 
145 150 155 160 

CAG ACA TGC GGT GCC ATC AAC AAA AAT GGT CTT CAA GCC TCA AGC TGT 528 
Gln Thr Cys Gly Ala Ile Asn Lys Asn Gly Leu Gln Ala Ser Ser Cys 
165 170 175 

GAA GTT CCT TTA CAC GGG GTG TGT AAG AAG GTC AGA CTT TGA 570 
Glu Val Pro Leu His Gly Val Cys Lys Lys Val Arg Leu 
180 185 
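
The relationship between the nucleotide and amino acid lines of SEQ ID NO: 8 can be checked by translating codons with the standard genetic code. The coding region spans positions 1 to 567, that is 567 / 3 = 189 codons, and the remaining three nucleotides (TGA) form the stop codon. The Python sketch below is illustrative only; the `translate` helper is an assumption made for this example, and only the first and last lines of the listing are used as inputs.

```python
# Standard genetic code, built in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate DNA (whitespace ignored) to one-letter protein, stopping at a stop codon."""
    dna = "".join(dna.split()).upper()
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# First and last nucleotide lines of SEQ ID NO: 8 as listed.
first_line = "ATG ACT GAC AGT GTT ATT TAT TCC ATG TTA GAG TTG CCT ACG GCA ACC"
last_line = "GAA GTT CCT TTA CAC GGG GTG TGT AAG AAG GTC AGA CTT TGA"

n_terminal = translate(first_line)
c_terminal = translate(last_line)
```

Passing the whole 570-nucleotide sequence to the same helper would give the full protein for comparison with SEQ ID NO: 9.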



(2) INFORMATION FOR SEQ ID NO: 9: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 189 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

Met Thr Asp Ser Val Ile Tyr Ser Met Leu Glu Leu Pro Thr Ala Thr 
1 5 10 15 

Gln Ala Gln Asn Asp Tyr Gly Pro Gln Gln Lys Ser Ser Ser Ser Lys 
20 25 30 

Pro Ser Cys Ser Cys Leu Val Ala Ile Thr Leu Gly Leu Leu Thr Ala 
35 40 45 

Val Leu Leu Ser Val Leu Leu Tyr Gln Trp Ile Leu Cys Gln Gly Ser 
50 55 60 

Asn Tyr Ser Thr Cys Ala Ser Cys Pro Ser Cys Pro Asp Arg Trp Met 
65 70 75 80 

Lys Tyr Gly Asn His Cys Tyr Tyr Phe Ser Val Glu Glu Lys Asp Trp 
85 90 95 

Asn Ser Ser Leu Glu Phe Cys Leu Ala Arg Asp Ser His Leu Leu Val 
100 105 110 

Ile Thr Asp Asn Gln Glu Met Ser Leu Leu Gln Val Phe Leu Ser Glu 
115 120 125 

Ala Phe Cys Trp Ile Gly Leu Arg Asn Asn Ser Gly Trp Arg Trp Glu 
130 135 140 

Asp Gly Ser Pro Leu Asn Phe Ser Arg Ile Ser Ser Asn Ser Phe Val 
145 150 155 160 

Gln Thr Cys Gly Ala Ile Asn Lys Asn Gly Leu Gln Ala Ser Ser Cys 
165 170 175 

Glu Val Pro Leu His Gly Val Cys Lys Lys Val Arg Leu 
180 185 

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 399 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..210 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

ATG ACT GAC AGT GTT ATT TAT TCC ATG TTA GAG TTG CCT ACG GCA ACC 48 
Met Thr Asp Ser Val Ile Tyr Ser Met Leu Glu Leu Pro Thr Ala Thr 
190 195 200 205 

CAA GCC CAG AAT GAC TAC GGA CCA CAG CAA AAA TCT TCC TCT TCC AGG 96 
Gln Ala Gln Asn Asp Tyr Gly Pro Gln Gln Lys Ser Ser Ser Ser Arg 
210 215 220 

CCT TCT TGT TCT TGC CTT GTG GCA ATA GCT TTG GGG CTT CTG ACT GCA 144 
Pro Ser Cys Ser Cys Leu Val Ala Ile Ala Leu Gly Leu Leu Thr Ala 
225 230 235 

GTT CTT CTG AGT GTG CTG CTA TAC CAG TGG ATC CTG TGC CAG GAG CCT 192 
Val Leu Leu Ser Val Leu Leu Tyr Gln Trp Ile Leu Cys Gln Glu Pro 
240 245 250 

GCT CCA AGT TTT CCT CAG TGAGGCCTTT TGCTGGATTG GTCTGAGGAA 240 
Ala Pro Ser Phe Pro Gln 
255 

CAATTCTGGC TGGAGGTGGG AAGACGGATC ACCTCTAAAC TTCTCAAGGA TTTCTTCTAA 300 

TAGCTTTGTG CAGACATGCG GTGCCATCAA CAAAAATGGT CTTCAAGCCT CAAGCTGTGA 360 

AGTTCCTTTA CACTGGGTGT GTAAGAAGGT CAGACTTTG 399 

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 70 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

Met Thr Asp Ser Val Ile Tyr Ser Met Leu Glu Leu Pro Thr Ala Thr 
1 5 10 15 

Gln Ala Gln Asn Asp Tyr Gly Pro Gln Gln Lys Ser Ser Ser Ser Arg 
20 25 30 

Pro Ser Cys Ser Cys Leu Val Ala Ile Ala Leu Gly Leu Leu Thr Ala 
35 40 45 

Val Leu Leu Ser Val Leu Leu Tyr Gln Trp Ile Leu Cys Gln Glu Pro 
50 55 60 

Ala Pro Ser Phe Pro Gln 
65 70 

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 300 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..297 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

ATG ACT GAC AGT GTT ATT TAT TCC ATG TTA GAG TTG CCT ACG GCA ACC 48 
Met Thr Asp Ser Val Ile Tyr Ser Met Leu Glu Leu Pro Thr Ala Thr 
75 80 85 

CAA GCC CAG AAT GAC TAC GGA CCA CAG CAA AAA TCT TCC TCT TCC AGG 96 
Gln Ala Gln Asn Asp Tyr Gly Pro Gln Gln Lys Ser Ser Ser Ser Arg 
90 95 100 

CCT TCT TGT TCT TGC CTT GTG GCA ATA GCT TTG GGG CTT CTG ACT GCA 144 
Pro Ser Cys Ser Cys Leu Val Ala Ile Ala Leu Gly Leu Leu Thr Ala 
105 110 115 

GTT CTT CTG AGT GTG CTG CTA TAC CAG TGG ATC CTG TGC CAG GGG ATT 192 
Val Leu Leu Ser Val Leu Leu Tyr Gln Trp Ile Leu Cys Gln Gly Ile 
120 125 130 

TCT TCT AAT AGC TTT GTG CAG ACA TGC GGT GCC ATC ACC AAA AAT GGT 240 
Ser Ser Asn Ser Phe Val Gln Thr Cys Gly Ala Ile Thr Lys Asn Gly 
135 140 145 150 

CTT CAA GCC TCA AGC TGT GAA GTT CCT TTA CAC TGG GTG TGT AAG AAG 288 
Leu Gln Ala Ser Ser Cys Glu Val Pro Leu His Trp Val Cys Lys Lys 
155 160 165 

GTC AGA CTT TGA 300 
Val Arg Leu 



(2) INFORMATION FOR SEQ ID NO: 13: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 99 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

Met Thr Asp Ser Val Ile Tyr Ser Met Leu Glu Leu Pro Thr Ala Thr 
1 5 10 15 

Gln Ala Gln Asn Asp Tyr Gly Pro Gln Gln Lys Ser Ser Ser Ser Arg 
20 25 30 

Pro Ser Cys Ser Cys Leu Val Ala Ile Ala Leu Gly Leu Leu Thr Ala 
35 40 45 

Val Leu Leu Ser Val Leu Leu Tyr Gln Trp Ile Leu Cys Gln Gly Ile 
50 55 60 

Ser Ser Asn Ser Phe Val Gln Thr Cys Gly Ala Ile Thr Lys Asn Gly 
65 70 75 80 

Leu Gln Ala Ser Ser Cys Glu Val Pro Leu His Trp Val Cys Lys Lys 
85 90 95 

Val Arg Leu 



(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 567 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..564 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

ATG GCC GAC AAC TCT ATC TAC TCA ACA TTA GAG CTG CCT GCT GCA CCT 48 
Met Ala Asp Asn Ser Ile Tyr Ser Thr Leu Glu Leu Pro Ala Ala Pro 
100 105 110 115 

CGA GTC CAA GAT GAC TCC AGA TGG AAG GTC AAA GCT GTC TTA CAC CGA 96 
Arg Val Gln Asp Asp Ser Arg Trp Lys Val Lys Ala Val Leu His Arg 
120 125 130 

CCC TGT GTT TCC TAC CTT GTG ATG GTG GCT TTG GGG CTT TTG ACT GTG 144 
Pro Cys Val Ser Tyr Leu Val Met Val Ala Leu Gly Leu Leu Thr Val 
135 140 145 

ATT CTC ATG AGT CTA CTG TTG TAC CAA CGG ACT CTG TGC TGT GGC TCC 192 
Ile Leu Met Ser Leu Leu Leu Tyr Gln Arg Thr Leu Cys Cys Gly Ser 
150 155 160 

AAG GGC TTT ATG TGT TCC CAG TGC TCC AGG TGC CCC AAC CTC TGG ATG 240 
Lys Gly Phe Met Cys Ser Gln Cys Ser Arg Cys Pro Asn Leu Trp Met 
165 170 175 

AGG AAC GGG AGC CAC TGT TAC TAC TTC TCA ATG GAG AAA AGG GAC TGG 288 
Arg Asn Gly Ser His Cys Tyr Tyr Phe Ser Met Glu Lys Arg Asp Trp 
180 185 190 195 

AAC TCT AGT CTG AAG TTC TGT GCA GAC AAA GGC TCG CAT CTC CTT ACA 336 
Asn Ser Ser Leu Lys Phe Cys Ala Asp Lys Gly Ser His Leu Leu Thr 
200 205 210 

TTT CCG GAC AAC CAG GGA GTG AAC CTG TTC CAG GAG TAT GTG GGC GAG 384 
Phe Pro Asp Asn Gln Gly Val Asn Leu Phe Gln Glu Tyr Val Gly Glu 
215 220 225 

GAC TTT TAC TGG ATT GGC TTG AGG GAC ATC GAT GGC TGG AGG TGG GAA 432 
Asp Phe Tyr Trp Ile Gly Leu Arg Asp Ile Asp Gly Trp Arg Trp Glu 
230 235 240 

GAT GGC CCA GCT CTC AGC TTA AGC ATT CTC TCT AAC AGC GTG GTA CAG 480 
Asp Gly Pro Ala Leu Ser Leu Ser Ile Leu Ser Asn Ser Val Val Gln 
245 250 255 

AAG TGT GGC ACC ATC CAC AGG TGT GGC CTC CAC GCC TCC AGT TGT GAG 528 
Lys Cys Gly Thr Ile His Arg Cys Gly Leu His Ala Ser Ser Cys Glu 
260 265 270 275 

GTT GCT TTG CAG TGG ATC TGT GAG AAG GTC CTG CCC TGA 567 
Val Ala Leu Gln Trp Ile Cys Glu Lys Val Leu Pro 
280 285 



(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 188 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

Met Ala Asp Asn Ser Ile Tyr Ser Thr Leu Glu Leu Pro Ala Ala Pro 
1 5 10 15 

Arg Val Gln Asp Asp Ser Arg Trp Lys Val Lys Ala Val Leu His Arg 
20 25 30 

Pro Cys Val Ser Tyr Leu Val Met Val Ala Leu Gly Leu Leu Thr Val 
35 40 45 

Ile Leu Met Ser Leu Leu Leu Tyr Gln Arg Thr Leu Cys Cys Gly Ser 
50 55 60 

Lys Gly Phe Met Cys Ser Gln Cys Ser Arg Cys Pro Asn Leu Trp Met 
65 70 75 80 

Arg Asn Gly Ser His Cys Tyr Tyr Phe Ser Met Glu Lys Arg Asp Trp 
85 90 95 

Asn Ser Ser Leu Lys Phe Cys Ala Asp Lys Gly Ser His Leu Leu Thr 
100 105 110 

Phe Pro Asp Asn Gln Gly Val Asn Leu Phe Gln Glu Tyr Val Gly Glu 
115 120 125 

Asp Phe Tyr Trp Ile Gly Leu Arg Asp Ile Asp Gly Trp Arg Trp Glu 
130 135 140 

Asp Gly Pro Ala Leu Ser Leu Ser Ile Leu Ser Asn Ser Val Val Gln 
145 150 155 160 

Lys Cys Gly Thr Ile His Arg Cys Gly Leu His Ala Ser Ser Cys Glu 
165 170 175 

Val Ala Leu Gln Trp Ile Cys Glu Lys Val Leu Pro 
180 185 

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(ix) FEATURE: 

(A) NAME/KEY: Modified-site 

(B) LOCATION: 6 

(D) OTHER INFORMATION: /note= "Xaa at position 6 is 
selected from the group which comprises 
A,E,F,G,I,L,K,H,N,P,Q,S,T,V,Y." 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

Pro Pro Leu Pro Gln Xaa Pro 
1 5 
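
The feature note for SEQ ID NO: 16 restricts Xaa at position 6 to a group of fifteen residues, so the entry stands for fifteen concrete heptapeptides. The Python sketch below enumerates them in one-letter code; the helper name `expand_xaa` and the use of X for Xaa are assumptions made for this illustration.

```python
# Residues permitted at position 6 according to the feature note.
ALLOWED_AT_6 = "AEFGILKHNPQSTVY"


def expand_xaa(template: str, allowed: str = ALLOWED_AT_6) -> list[str]:
    """Replace the single X in a one-letter template by each allowed residue."""
    if template.count("X") != 1:
        raise ValueError("template must contain exactly one X")
    return [template.replace("X", residue) for residue in allowed]


# SEQ ID NO: 16 (Pro Pro Leu Pro Gln Xaa Pro) in one-letter code.
variants = expand_xaa("PPLPQXP")
```

The same helper applies unchanged to the other entries that carry the same feature note, for example `expand_xaa("GPLPKXP")`.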

(2) INFORMATION FOR SEQ ID NO: 17: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 7 amino acids 
(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(ix) FEATURE: 

(A) NAME/KEY: Modified-site 

(B) LOCATION: 6 

(D) OTHER INFORMATION: /note= "Xaa at position 6 is 
selected from the group which comprises 
A,E,F,G,I,L,K,H,N,P,Q,S,T,V,Y." 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

Gly Pro Leu Pro Lys Xaa Pro 
1 5 

(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(ix) FEATURE: 

(A) NAME/KEY: Modified-site 

(B) LOCATION: 6 

(D) OTHER INFORMATION: /note= "Xaa at position 6 is 
selected from the group which comprises 
A,E,F,G,I,L,K,H,N,P,Q,S,T,V,Y." 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 18: 

Val Pro Val Pro Lys Xaa Pro 
1 5 

(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(ix) FEATURE: 
(A) NAME/KEY: Modified-site 

(B) LOCATION: 6 

(D) OTHER INFORMATION: /note= "Xaa at position 6 is 
selected from the group which comprises 
A,E,F,G,I,L,K,H,N,P,Q,S,T,V,Y." 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 19: 

Ala Pro Leu Pro His Xaa Pro 
1 5 

(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(ix) FEATURE: 

(A) NAME/KEY: Modified-site 

(B) LOCATION: 6 

(D) OTHER INFORMATION: /note= "Xaa at position 6 is 
selected from the group which comprises 
A,E,F,G,I,L,K,H,N,P,Q,S,T,V,Y." 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 

Thr Pro Leu Pro Lys Xaa Pro 
1 5 

(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA to mRNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 21: 
GCCGGATCCG ATGACTGACA GTGTTATTTA TTCCATGTTA 
(2) INFORMATION FOR SEQ ID NO: 22: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 39 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA to mRNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 
TAAGGATCCT CAAAGTCTGA CCTTCTTACA CACCCAGTG 39 
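
How SEQ ID NO: 22 relates to the coding sequences listed above can be checked by reverse-complementing it. In the Python sketch below, the last ten codons of SEQ ID NO: 12 are copied from the listing; treating the first nine nucleotides of the primer (TAA followed by GGATCC, the BamHI recognition sequence) as a non-annealing tail is an assumption made for this illustration, as is the helper name `reverse_complement`.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Reverse complement of a DNA string; whitespace is ignored."""
    return "".join(dna.split()).upper().translate(COMPLEMENT)[::-1]


SEQ_ID_22 = "TAAGGATCCT CAAAGTCTGA CCTTCTTACA CACCCAGTG"
# Final ten codons of SEQ ID NO: 12, including the TGA stop codon.
SEQ_ID_12_END = "CAC TGG GTG TGT AAG AAG GTC AGA CTT TGA"

primer = "".join(SEQ_ID_22.split())
# Assumption: the first 9 nt (TAA + GGATCC) do not anneal to the template.
annealing_part = primer[9:]
matches_seq_12_end = reverse_complement(annealing_part) == "".join(SEQ_ID_12_END.split())
```

Under that assumption the remaining 30 nucleotides are the exact reverse complement of the end of the SEQ ID NO: 12 coding region, including the TGG (Trp) codon.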

(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 

Tyr Ser Thr Leu 
1 

(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(ix) FEATURE: 

(A) NAME/KEY: Modified-site 

(B) LOCATIONS 

(D) OTHER INFORMATION: /note= "Xaa at position 4 is 

selected from the group which comprises L and..." 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 



Tyr Xaa Xaa Xaa 
1 
Claims 

1. A polypeptide which comprises or consists of the sequence of amino acid residues: 

X-Pro-X-Pro-X-X-Pro. 

2. A polypeptide according to claim 1 which comprises or consists of the sequence of 
amino acid residues selected from the group: 

Pro-Pro-Leu-Pro-Gln-X-Pro 

Val-Pro-Val-Pro-Lys-X-Pro 

Gly-Pro-Leu-Pro-Lys-X-Pro 

Ala-Pro-Leu-Pro-His-X-Pro 

Thr-Pro-Leu-Pro-Lys-X-Pro 

Glu-Pro-Ala-Pro-Ser-Phe-Pro-Gln. 

3. A polypeptide according to claim 1 or 2 which consists of a sequence of 7 to 20, 
preferably 7 to 10, more preferably 7 or 8 amino acid residues. 

4. A polypeptide which comprises or consists of the sequence of amino acid residues 
corresponding to a truncated form of human MAFA. 



5. A polypeptide according to claim 4 wherein the truncated human MAFA is 
huMAFA[E3-] or huMAFA[E3/4-]. 

6. A polypeptide which comprises or consists of the sequence of amino acid residues 
corresponding to human MAFA. 

7. A nucleotide sequence which codes for the polypeptide sequence of any one of 
claims 1 to 6. 

8. An antibody or fragment thereof specific for an epitope of the C terminal 
extracellular domain sequences expressed on spliced type II C-lectin-like membrane 
proteins. 

9. An antibody or fragment thereof specific for an epitope of the N terminal 
intracellular domain sequences of type II C-lectin-like membrane proteins. 

10. An antibody or fragment thereof according to claim 8 or 9 wherein the type II 
C-lectin-like membrane protein is human MAFA. 

11. An antibody or fragment thereof according to claim 8, 9 or 10 wherein the protein 
is human MAFA[E3-] or human MAFA[E3/4-]. 

12. A ligand specific for a fragment of human MAFA which is expressed on the surface 
of filamentous phage. 

13. A composition comprising a therapeutic amount of a polypeptide of claims 1 to 6; 
antibody or fragment thereof of claims 8 to 11; or ligand of claim 12, together with a 
pharmaceutically acceptable diluent or carrier. 

14. A polypeptide according to any one of claims 1 to 6 for use as a medicament. 

15. A polypeptide according to any one of claims 1 to 6 for use in the treatment of 
inflammatory or allergic diseases or tumour growth. 

16. A nucleotide sequence according to claim 7 for use in therapy. 

17. A nucleotide sequence according to claim 7 for use in the treatment of 
inflammatory or allergic diseases or tumour growth. 

18. An antibody or fragment thereof according to any one of claims 8 to 11 for use as a 
medicament. 

19. An antibody or fragment thereof according to any one of claims 8 to 11 for use in 
the treatment of inflammatory or allergic diseases or tumour growth. 

20. A ligand according to claim 12 for use as a medicament. 

21. A ligand according to claim 12 for use in the treatment of inflammatory or allergic 
diseases or tumour growth. 

22. A composition according to claim 13 for use as a medicament. 

23. A composition according to claim 13 for use in the treatment of inflammatory or 
allergic diseases or tumour growth. 

24. Use of a polypeptide according to any one of claims 1 to 6 in the manufacture of a 
medicament for the treatment of inflammatory or allergic diseases or tumour growth. 




25. Use of a nucleotide sequence according to claim 7 in the manufacture of a 
medicament for the treatment of inflammatory or allergic diseases or tumour growth. 

26. Use of an antibody or fragment thereof according to any one of claims 8 to 11 in 
the manufacture of a medicament for the treatment of inflammatory or allergic diseases or 
tumour growth. 

27. Use of a ligand according to claim 12 in the manufacture of a medicament for the 
treatment of inflammatory or allergic diseases or tumour growth. 

28. Use of a composition according to claim 13 in the manufacture of a medicament for 
the treatment of inflammatory or allergic diseases or tumour growth. 

29. A method of treatment for inflammatory or allergic diseases or tumour growth 
which comprises administering an effective dose of a polypeptide of claims 1 to 6; an 
antibody or fragment thereof of claims 8 to 11; a ligand of claim 12; or a composition of 
claim 13. 

30. A method of preparing a polypeptide according to any one of claims 1 to 6 which 
comprises the steps of: 

i) Nα-Fmoc deprotection; 

ii) washing; 

iii) coupling of a single amino acid residue or amino acid mixtures; 

iv) washing; 

v) repeating until the desired polypeptide is constructed. 
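
The cycle of steps i) to v) can be sketched as a simple loop. The Python example below is illustrative only and rests on stated assumptions: every residue, including the first, is taken through the same four steps; a position written Xaa is coupled as an amino acid mixture; and the chain is extended from the C-terminus towards the N-terminus, as is usual in Fmoc solid-phase synthesis. The function name `synthesis_plan` is hypothetical.

```python
def synthesis_plan(residues_n_to_c: list[str]) -> list[str]:
    """List the steps for assembling a peptide given in N-to-C order."""
    plan = []
    # Assumption: the chain grows from the C-terminal residue towards the N-terminus.
    for residue in reversed(residues_n_to_c):
        plan.append("Nα-Fmoc deprotection")           # step i)
        plan.append("wash")                           # step ii)
        if residue == "Xaa":
            plan.append("couple amino acid mixture")  # step iii), mixture
        else:
            plan.append(f"couple {residue}")          # step iii), single residue
        plan.append("wash")                           # step iv)
    return plan                                       # step v) is the loop itself


# Example: the heptapeptide Pro-Pro-Leu-Pro-Gln-X-Pro of claim 2.
plan = synthesis_plan("Pro Pro Leu Pro Gln Xaa Pro".split())
```

For a seven-residue peptide the plan contains seven cycles of four steps each.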
Figure 1 
Figure 2 
Figure 3 